perf(toolpath-desktop): perf tracer + buildTree memo (net effect: faster preview open)#54

Merged
eliothedeman merged 4 commits into main from eliot/goofy-shirley-9e2fbc on Apr 23, 2026

@eliothedeman
Collaborator

Summary

Measuring before optimising. Added a click→derive→render perf tracer that records timestamps at each phase and can show them as an always-visible overlay; used it to find that the real cost of opening a trace on a large Claude session was Svelte render work (not Rust derive), then landed a tiny WeakMap memo that removes a duplicate buildTree computation.

End-to-end click-to-painted latency for a 1737-step / 609-turn Claude session: 593ms → 472ms (121ms saved, roughly 20% faster).

Backend is unchanged vs. main. An earlier commit on the branch prototyped a two-tier pre-derive cache (memory + disk); after the measurement showed derive was only ~40% of the click latency and the render work dominated, the cache was reverted as added complexity without a proportional win. The revert is included here so the branch tells the full story. Net backend diff vs. main: zero.

What's in here

  • Perf tracer (frontend/src/lib/perf.svelte.ts, PerfOverlay.svelte): perfStart / perfMark / perfEnd record checkpoints; the store marks dispatch, invoke-start, invoke-end, model-updated; Preview marks preview-mounted, viz-rendered / dom-painted; buildTree and flattenChatHead mark their own timings with step/turn counts. Summary logs to the devtools console always; set localStorage.perf = "1" and reload to also show a phase-bar overlay in the bottom-right.
  • WeakMap memos in tree.ts for buildTree and flattenChatHead: both StepTree and ChatView independently called buildTree(doc) from their own $derived blocks, doubling the normalize + flatten cost on every preview open. Keyed on doc / norm identity so old entries get GC'd when a new derive replaces the doc.
  • Deferred state writes in perf.svelte.ts via queueMicrotask: fixes state_unsafe_mutation when perfMark is called from inside a $derived (ChatView's turns reads buildTree + flattenChatHead, which now call perfMark).

Out of scope

  • Making the Rust derive faster. It's now the dominant piece (335ms of the 472ms) — filed as #53 to profile read_conversation + derive_path and consider streaming.

Test plan

  • cargo test -p toolpath-desktop — 17 passing (unchanged vs. main)
  • cargo clippy -p toolpath-desktop --all-targets -- -D warnings — clean
  • bun run check + bun run build — svelte-check clean, Vite build succeeds
  • Manual: click Select → on a long Claude session; perf log prints to console; overlay appears when localStorage.perf = "1"; no state_unsafe_mutation runtime error.
  • Reviewer: try toggling on/off with localStorage.perf; reload; open a couple sessions to confirm the memo hits on the second component (look for buildTree cache-hit in the console).

Clicking a session in Quick View or the Browse "Select →" button used
to run derive synchronously on the UI path, producing a noticeable
pause. This adds an in-memory + on-disk cache (`src/cache.rs`,
`TraceCache`) that the tray poller warms after every 30s scan for each
recent claude/pi session. Both the popover's `tray_open_trace` and the
main-window `derive_claude` / `derive_pi` IPC commands route through
the same cache via `derive_claude_impl` / `derive_pi_impl`, so cached
hits short-circuit before any derive work.

- Memory tier: `HashMap`, 32-entry LRU, rejected when a warmer is
  already in flight for the same key.
- Disk tier: `<temp_dir>/toolpath-desktop/trace-cache/<fnv1a64>.json`,
  atomic writes (.tmp + rename), 200-entry cap pruned oldest-first on
  startup, corrupt files deleted on read.
- Freshness: keyed on the source session's `last_activity`. Warmer
  passes overwrite stale entries; user-initiated derives backfill with
  an empty timestamp and get replaced on the next poll.
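
For readers who have not seen this kind of cache, the memory tier and freshness rule above might look roughly like the sketch below. It is written in TypeScript for illustration only; the code this commit describes lived in Rust in `src/cache.rs`. `MemoryTier`, `CacheEntry`, `get`, `put`, and `needsRefresh` are hypothetical names, the in-flight rejection is left out, and the assumption that lookups ignore the freshness stamp while only the warmer checks it is a guess.

```typescript
// Hypothetical sketch of a 32-entry LRU memory tier; not the reverted Rust code.
interface CacheEntry<V> {
  lastActivity: string; // freshness stamp; "" marks a user-initiated backfill
  value: V;
}

class MemoryTier<V> {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private entries = new Map<string, CacheEntry<V>>();
  private capacity: number;

  constructor(capacity: number = 32) {
    this.capacity = capacity;
  }

  // Look up a value and mark it as most recently used.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  // Insert or overwrite, evicting the least recently used entry when full.
  put(key: string, lastActivity: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { lastActivity, value });
  }

  // True when the warmer should re-derive: entry missing or stamped differently.
  needsRefresh(key: string, lastActivity: string): boolean {
    const entry = this.entries.get(key);
    return entry === undefined || entry.lastActivity !== lastActivity;
  }
}
```

Under this sketch a backfilled entry (stamped `""`) always reports `needsRefresh` as true for a real `last_activity`, so the next warmer pass overwrites it.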

Tests rise from 17 to 32 unit tests (new cache-tier tests, cache-hit
short-circuits for both providers, prewarm provider routing).

To isolate where perceived click latency actually lives (Rust derive
vs. Svelte/dagre render), the store and Preview now emit perf marks
at every checkpoint in the flow:

  dispatch → invoke-start → invoke-end → model-updated
  → preview-mounted → viz-rendered (or dom-painted in chat mode)

The popover's `trace:opened` event path also starts its own trace so
the overlay can show post-derive render time in isolation (Rust has
already finished).

Every completed trace logs a phase-delta summary to the devtools
console. To also show the phase-bar overlay in the bottom-right of
the main window, set `localStorage.perf = "1"` and reload.
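
As a rough illustration of how checkpoint marks turn into a phase-delta summary, here is a minimal framework-free sketch. It is not the contents of `perf.svelte.ts`: `createTracer`, the injected `now` clock, and the exact `perfStart` / `perfMark` / `perfEnd` signatures are assumptions made so the example runs on its own.

```typescript
// Hypothetical sketch; the real perf.svelte.ts API may differ in shape.
interface Mark {
  name: string;
  at: number; // timestamp in milliseconds
}

interface PhaseDelta {
  from: string;
  to: string;
  ms: number;
}

function createTracer(now: () => number) {
  let marks: Mark[] = [];

  return {
    // Begin a new trace, discarding any marks from a previous one.
    perfStart(name: string): void {
      marks = [{ name, at: now() }];
    },
    // Record a checkpoint; ignored when no trace is active.
    perfMark(name: string): void {
      if (marks.length > 0) marks.push({ name, at: now() });
    },
    // Record the final checkpoint and return the time between consecutive marks.
    perfEnd(name: string): PhaseDelta[] {
      if (marks.length === 0) return [];
      marks.push({ name, at: now() });
      const deltas: PhaseDelta[] = [];
      for (let i = 1; i < marks.length; i++) {
        const prev = marks[i - 1];
        const cur = marks[i];
        deltas.push({ from: prev.name, to: cur.name, ms: cur.at - prev.at });
      }
      marks = [];
      return deltas;
    },
  };
}
```

In a browser the clock would most likely be `() => performance.now()`; injecting it keeps the sketch testable with a fake clock.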

Scope is read-only: no behavioural change to derive or caching.

The perf tracer showed the real bottleneck sits between model-update
and component-mount (~205ms of Svelte render work) rather than in the
Rust derive (~80ms), so the two-tier cache added in the earlier
commit was optimising the wrong thing. Stripping it removes a lot of
complexity (cache.rs, disk persistence, prewarm threading, in-flight
slots, LRU eviction) that wasn't buying the user anything.

Kept: the perf tracer, the overlay, and the `buildTree` /
`flattenChatHead` marks. These are what let us see exactly
where the remaining time goes.

Also fix `state_unsafe_mutation` thrown when `perfMark` is called
from inside a `$derived` (which happens when ChatView's `turns`
derivation runs `buildTree` + `flattenChatHead`): defer `perf.latest`
writes to a microtask so mutation never happens during derivation.
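
The deferral can be sketched without Svelte. In the example below a plain `deriving` flag stands in for the runtime check that raises `state_unsafe_mutation`; `writeLatest`, `perfMarkDeferred`, and `derive` are hypothetical names, and only the `queueMicrotask` call mirrors what this commit describes.

```typescript
// Stand-in for a reactive runtime that forbids state writes while a derivation runs.
let deriving = false;
const perf = { latest: "" };

// Direct write: throws if it happens while a derivation is in progress.
function writeLatest(value: string): void {
  if (deriving) throw new Error("state_unsafe_mutation");
  perf.latest = value;
}

// Deferred write: queued as a microtask, so it runs only after the
// current synchronous work (including the derivation) has finished.
function perfMarkDeferred(name: string): void {
  queueMicrotask(() => writeLatest(name));
}

// Run a computation with the "derivation in progress" flag set.
function derive<T>(compute: () => T): T {
  deriving = true;
  try {
    return compute();
  } finally {
    deriving = false;
  }
}
```

Calling `writeLatest` inside `derive` throws, while `perfMarkDeferred` inside `derive` leaves `perf.latest` untouched until the microtask queue drains.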
…tity

Both StepTree.svelte and ChatView.svelte independently called
`buildTree(doc)` from their own `$derived` blocks, so every preview
open paid the normalize + flatten cost twice. Add WeakMap memos keyed
by `doc` / `norm` identity — callers always pass `store.m.preview.doc`
which is a stable reference across renders, and the WeakMap lets the
old entry get collected when a new derive replaces the doc.
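
A minimal sketch of the identity-keyed memo, assuming simplified stand-in types: `Doc`, `Tree`, `buildTreeUncached`, and `buildCount` are hypothetical and exist only to make the example self-contained, and the real `tree.ts` does more work per call.

```typescript
// Hypothetical stand-ins for the real document and tree types.
interface Doc {
  steps: string[];
}

interface Tree {
  nodes: string[];
}

let buildCount = 0; // counts real computations, for demonstration only

// Placeholder for the expensive normalize + flatten work.
function buildTreeUncached(doc: Doc): Tree {
  buildCount++;
  return { nodes: doc.steps.map((step) => step.toUpperCase()) };
}

// Keyed by object identity; an entry becomes collectable once its doc is unreachable.
const treeCache = new WeakMap<Doc, Tree>();

function buildTree(doc: Doc): Tree {
  const cached = treeCache.get(doc);
  if (cached !== undefined) return cached;
  const tree = buildTreeUncached(doc);
  treeCache.set(doc, tree);
  return tree;
}
```

Two callers passing the same `doc` reference share one computation, while a structurally equal but distinct object is computed again, which is why the stable `store.m.preview.doc` reference matters.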

Measured end-to-end click-to-painted on a 1737-step / 609-turn Claude
session: 593ms → 472ms (buildTree dedupe + JIT-warm flattenChatHead).
@eliothedeman eliothedeman merged commit 9240f82 into main Apr 23, 2026
2 checks passed
@github-actions

🔍 Preview deployed: https://1f2fb1a4.toolpath.pages.dev
